Abstract

In this paper we study interactive “one-shot” analogues of the classical 
Slepian-Wolf theorem. Alice receives a value of a random variable X, Bob 
receives a value of another random variable Y that is jointly distributed 
with X. Alice’s goal is to transmit X to Bob (with some error probability
$\varepsilon$). Instead of one-way transmission, which is studied in classical
coding theory, we allow them to interact. They may also use shared
randomness. 

We show that Alice can transmit X to Bob using
$H(X|Y) + 2\sqrt{H(X|Y)} + O(\log_2(1/\varepsilon))$ bits in expectation. Moreover,
we show that every one-round protocol $\pi$ with information complexity $I$ can
be compressed to a (many-round) protocol with expected communication about
$I + 2\sqrt{I}$ bits. This improves a result by Braverman and Rao [3], where they
had $5\sqrt{I}$. Further, we show how to solve this problem (transmitting X) using
$3H(X|Y) + O(\log_2(1/\varepsilon))$ bits and 4 rounds on average. This improves a
result of [3], where they had $4H(X|Y) + O(\log 1/\varepsilon)$ bits and 10 rounds on
average.

At the end of the paper we discuss how many bits Alice and Bob may
need to communicate on average besides $H(X|Y)$. The main question
is whether the upper bounds mentioned above are tight. We provide an
example of (X, Y) such that transmission of X from Alice to Bob with
error probability $\varepsilon$ requires $H(X|Y) + \Omega(\log_2(1/\varepsilon))$ bits on average.


1 Introduction 

Assume that Alice receives a value of a random variable X and she wants to
transmit that value to Bob. It is well-known ([5]) that Alice can do it using one
message over the binary alphabet of expected length less than $H(X) + 1$. Assume
now that there are n independent random variables $X_1, \ldots, X_n$ distributed as
X, and Alice wants to transmit all $X_1, \ldots, X_n$ to Bob. Another classical result
from [5] states that Alice can do it using one message of fixed length, namely
$\approx nH(X)$, with a small probability of error.
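The first fact above (expected length below $H(X) + 1$) can be checked numerically. The following is a minimal sketch, assuming a binary Huffman code as the one-message encoding; the text does not name a particular code, and the helper names `entropy`, `huffman_lengths` and `expected_length` are illustrative only.

```python
import heapq
import math


def entropy(probs):
    """Shannon entropy (in bits) of a finite distribution."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for `probs`."""
    if len(probs) == 1:
        return [1]
    # Heap items: (weight, tie_breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees makes every codeword below them one bit longer.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


def expected_length(probs):
    """Expected codeword length of the Huffman code for `probs`."""
    return sum(p * l for p, l in zip(probs, huffman_lengths(probs)))
```

For a dyadic distribution such as (1/2, 1/4, 1/8, 1/8) the expected length coincides with the entropy; in general it lies in the interval $[H(X), H(X) + 1)$.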
One of the possible ways to generalize this problem is to provide Bob with a
value of another random variable Y which is jointly distributed with X. That
is, to let Bob know some partial information about X for free. This problem is
the subject of the classical Slepian-Wolf Theorem [S], which asserts that if there
are n independent pairs $(X_1, Y_1), \ldots, (X_n, Y_n)$, each pair distributed exactly as
(X, Y), then Alice can transmit all $X_1, \ldots, X_n$ to Bob, who knows $Y_1, \ldots, Y_n$,
using one message of fixed length, namely $\approx nH(X|Y)$, with a small probability
of error. However, it turns out that a one-shot analogue of this theorem is
impossible if only one-way communication is allowed.

The situation is quite different if we allow Alice and Bob to interact, that
is, to send messages in both directions. In [7] Orlitsky studied this problem for
the average-case communication when no error is allowed. He showed that if
the pair (X, Y) is uniformly distributed on its support, then Alice may transmit X
to Bob using at most

$$H(X|Y) + 3\log_2(H(X|Y) + 1) + 17$$

bits on average and 4 rounds. For the pairs (X, Y) whose support is a Cartesian
product Orlitsky showed that error-less transmission of X from Alice to Bob
requires $H(X)$ bits on average.

From a result of Braverman and Rao ([3]), it follows that for arbitrary (X, Y)
it is sufficient to communicate at most

$$H(X|Y) + 5\sqrt{H(X|Y)} + O\left(\log_2 \frac{1}{\varepsilon}\right)$$

bits on average (here $\varepsilon$ stands for the error probability).

We improve this result, showing that Alice may transmit X to Bob with
error probability at most $\varepsilon$ (for each pair of inputs) using at most

$$H(X|Y) + 2\sqrt{H(X|Y)} + O\left(\log_2 \frac{1}{\varepsilon}\right)$$

bits on average and $O(\sqrt{H(X|Y)})$ rounds. Our protocol is inspired by the protocol
from [3]. The idea of the protocol is essentially the same; we only apply a
technical trick to reduce communication.

Actually, in [3] a more general result was established. It was shown there that
every one-round protocol $\pi$ with information complexity $I$ can be compressed
to a (many-round) protocol with expected length at most

$$\approx I + 5\sqrt{I}. \qquad (1)$$

Using the result from [3], we improve (1). Namely, we show that every one-round
protocol $\pi$ with information complexity $I$ can be compressed to a (many-round)
protocol with expected communication length at most

$$\approx I + 2\sqrt{I}.$$
In [3], a one-shot interactive analogue of the Slepian-Wolf theorem is
established for bounded-round communication. They showed that Alice may
transmit X to Bob using at most $O(H(X|Y) + 1)$ bits and $O(1)$ rounds on average.
More specifically, their protocol transmits at most $4H(X|Y) + \log_2(1/\varepsilon) + O(1)$
bits on average in 10 rounds on average. In this paper, we provide another
proof of this result, which seems to be easier. More specifically, we show that it
is sufficient to communicate at most

$$3H(X|Y) + \log_2\frac{1}{\varepsilon} + O(1)$$

bits on average in at most 4 rounds on average.

From the proof of our upper bound it follows that there exists a deterministic
protocol which transmits X from Alice to Bob using the same number of bits on
average (namely $H(X|Y) + 2\sqrt{H(X|Y)} + O(\log_2(1/\varepsilon))$) and which guarantees
that for at most an $\varepsilon$-fraction of inputs (with respect to the distribution of (X, Y))
the transmission is incorrect. Are there random variables X, Y for which the
corresponding upper bound is tight? We make a step towards answering this
question: we provide an example of random variables X, Y such that every
deterministic protocol which transmits X from Alice to Bob with error probability
$\varepsilon$ must communicate at least $H(X|Y) + \Omega(\log_2(1/\varepsilon))$ bits on average.

In the Appendix we provide an example of (X, Y) for which it seems plausible
that the upper bound $H(X|Y) + O(\sqrt{H(X|Y)})$ is tight.

2 Definitions 

We will denote the set of the first n naturals {1, 2,..., n} by [n]. 

2.1 Information Theory 

Let X, Y be two jointly distributed random variables, taking values in the finite
sets $\mathcal{X}$ and $\mathcal{Y}$, respectively.

Definition 2.1. Shannon entropy of X is defined by the formula

$$H(X) = \sum_{x \in \mathcal{X}} \Pr[X = x] \log_2 \frac{1}{\Pr[X = x]}.$$

Definition 2.2. Conditional Shannon entropy of X with respect to Y is defined
by the formula

$$H(X|Y) = \sum_{y \in \mathcal{Y}} H(X|Y = y) \Pr[Y = y],$$

where $X|Y = y$ denotes the distribution of X conditioned on the event $\{Y = y\}$.

If X is uniformly distributed in $\mathcal{X}$ then obviously $H(X) = \log_2(|\mathcal{X}|)$. We
will also use the fact that the formula for conditional entropy may be re-written
as

$$H(X|Y) = \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \Pr[X = x, Y = y] \log_2 \frac{1}{\Pr[X = x|Y = y]}.$$
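The equivalence of the two formulas for conditional entropy can be checked on a small example. The sketch below assumes the joint distribution is given as a dictionary mapping pairs (x, y) to probabilities; this representation and the function names are illustrative choices, not part of the paper.

```python
import math
from collections import defaultdict


def marginal_y(joint):
    """Marginal distribution of Y from a dict {(x, y): probability}."""
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    return p_y


def conditional_entropy_by_definition(joint):
    """H(X|Y) as the average over y of H(X | Y = y) (Definition 2.2)."""
    p_y = marginal_y(joint)
    total = 0.0
    for y0, py in p_y.items():
        h = 0.0
        for (_, y), p in joint.items():
            if y == y0 and p > 0:
                q = p / py  # Pr[X = x | Y = y0]
                h += q * math.log2(1.0 / q)
        total += py * h
    return total


def conditional_entropy_rewritten(joint):
    """H(X|Y) as the sum of Pr[x, y] * log2(1 / Pr[x | y])."""
    p_y = marginal_y(joint)
    return sum(p * math.log2(p_y[y] / p)
               for (_, y), p in joint.items() if p > 0)
```

For instance, if Y is a fair bit, X is a fair bit when Y = 0 and X = 0 when Y = 1, both functions return 1/2.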

A generalization of the Shannon entropy is the Rényi entropy.

Definition 2.3. Rényi entropy of X is defined by the formula

$$H_2(X) = -\log_2\left(\sum_{x \in \mathcal{X}} \Pr[X = x]^2\right).$$

Concavity of log implies that $H(X) \ge H_2(X)$.
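The inequality between the two entropies is easy to observe numerically. This is a small sketch with illustrative function names; distributions are plain lists of probabilities.

```python
import math


def shannon_entropy(probs):
    """H(X) in bits."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


def renyi2_entropy(probs):
    """H_2(X) = -log2 of the collision probability."""
    return -math.log2(sum(p * p for p in probs))
```

On a uniform distribution the two quantities coincide; on a non-uniform one the Rényi entropy is strictly smaller.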

The mutual information of two random variables X and Y, conditioned on 
another random variable Z, can be defined as: 

$$I(X : Y|Z) = H(X|Z) - H(X|Y, Z).$$


For the further introduction in information theory see, for example m- 

2.2 Communication Protocols 

Assume that we are given jointly distributed random variables X and Y, taking
values in finite sets $\mathcal{X}$ and $\mathcal{Y}$. Let $R, R_A, R_B$ be random variables, taking
values in finite sets $\mathcal{R}$, $\mathcal{R}_A$ and $\mathcal{R}_B$, such that $(X,Y), R, R_A, R_B$ are mutually
independent.

Definition 2.4. A randomized communication protocol is a rooted binary tree,
in which each non-leaf vertex is associated either with Alice or with Bob. For
each non-leaf vertex v associated with Alice there is a function
$f_v : \mathcal{X} \times \mathcal{R} \times \mathcal{R}_A \to \{0,1\}$ and for each non-leaf vertex u associated with Bob
there is a function $g_u : \mathcal{Y} \times \mathcal{R} \times \mathcal{R}_B \to \{0,1\}$. For each non-leaf vertex one of
the out-going edges is labeled by 0 and the other is labeled by 1. Finally, for
each leaf l there is a function $\varphi_l : \mathcal{Y} \times \mathcal{R} \times \mathcal{R}_B \to \mathcal{O}$, where $\mathcal{O}$ denotes the set of
all possible Bob’s outputs.

A computation according to a protocol runs as follows. Alice is given $x \in \mathcal{X}$,
Bob is given $y \in \mathcal{Y}$. Assume that the random variable R takes a value r, $R_A$
takes a value $r_A$ and $R_B$ takes a value $r_B$. Alice and Bob start at the root of
the tree. If they are in a non-leaf vertex v associated with Alice, then Alice
sends $f_v(x, r, r_A)$ to Bob and they go by the edge labeled by $f_v(x, r, r_A)$. If they
are in a non-leaf vertex u associated with Bob, then Bob sends $g_u(y, r, r_B)$ to Alice
and they go by the edge labeled by $g_u(y, r, r_B)$. When they reach a leaf l, Bob
outputs the result $\varphi_l(y, r, r_B)$.

A protocol is called public-coin if $f_v$, $g_u$ and $\varphi_l$ do not depend on the values
of $R_A, R_B$.

A protocol is called deterministic if $f_v$, $g_u$ and $\varphi_l$ do not depend on the values
of $R, R_A, R_B$.

We distinguish between average-case communication complexity and the 
worst-case communication complexity. 
Definition 2.5. The (worst-case) communication complexity of a protocol $\pi$,
denoted by $CC(\pi)$, is defined as the depth of the corresponding binary tree.

We say that a protocol $\pi$ communicates d bits on average (or that the expected
length of the protocol is equal to d) if the expected depth of the leaf that Alice and
Bob reach during the execution of the protocol $\pi$ is equal to d, where the expectation
is taken over $X, Y, R, R_A, R_B$.

If Alice’s goal is to transmit X to Bob, then at the end of the communication
Bob should output some element of $\mathcal{X}$ (that is, $\mathcal{O} = \mathcal{X}$). We say that a
protocol transmits X from Alice to Bob with error probability $\varepsilon$ if

$$\Pr[X = \varphi_L(Y, R, R_B)] \ge 1 - \varepsilon,$$

where L denotes the leaf that Alice and Bob reach in the protocol tree.

For the worst-case communication it is sufficient to consider only deterministic
protocols. Indeed, assume that we are given a randomized protocol solving
our problem with error probability $\varepsilon$. Fix the value of R for which the error
probability is minimal. In this way we obtain a protocol with the same worst-case
communication complexity and error probability.

For a further introduction to Communication Complexity see [5].


3 Near-optimal one-shot Slepian-Wolf theorem 

Consider the following auxiliary problem. Let A be a finite set. Assume that
Alice receives an arbitrary $a \in A$ and Bob receives an arbitrary probability
distribution $\mu$ on A. Alice wants to communicate a to Bob in about $\log(1/\mu(a))$
bits with small probability of error.

Lemma 3.1. Let $\varepsilon$ be a positive real and h a positive integer. There exists a
public-coin randomized communication protocol such that for all a in the support
of $\mu$ the following hold:

• at the end of the communication Bob outputs $b \in A$ which is equal to a
with probability at least $1 - \varepsilon$;

• the protocol communicates at most

$$\log_2\frac{1}{\mu(a)} + \frac{\log_2(1/\mu(a))}{h} + h + \log_2\frac{1}{\varepsilon} + O(1)$$

bits, regardless of the randomness.

Proof. Alice and Bob interpret each portion of $|A|$ consecutive bits from the
public randomness source as a table of a random function $A \to \{0,1\}$. That
is, we will think that they have access to a large enough family of mutually
independent random functions of the type $A \to \{0,1\}$. Those functions will be
called hash functions and their values hash values below.

First set $k = \lceil \log_2(1/\varepsilon) \rceil + 1$. Then Bob sets:

$$S_i = \{x \in A \mid \mu(x) \in (2^{-i-1}, 2^{-i}]\}.$$


Then Alice and Bob work in stages numbered 0, 1, ....

On Stage 0:

1. Alice sends k hash values of a to Bob.

2. Bob computes the set $S'_0$, which consists of all elements from $S_0$ that have the
same hash values as sent by Alice (actually $S_0$ has at most one element).

3. If $S'_0 \ne \emptyset$, then Bob sends 1 to Alice, outputs any element of $S'_0$ and they
terminate. Otherwise Bob sends 0 to Alice and they proceed to Stage 1.

On Stage t:

1. Alice sends h new hash values of a to Bob so that the total number of
hash values of a available to Bob is $k + ht$.

2. For each $i \in \{h(t-1) + 1, \ldots, ht\}$ Bob computes the set $S'_i$, which consists of
all elements from $S_i$ which agree with all Alice’s hash values.

3. If there exists $i \in \{h(t-1) + 1, \ldots, ht\}$ such that $S'_i \ne \emptyset$, then Bob sends
1 to Alice, outputs any element of $S'_i$ and they terminate. Otherwise Bob
sends 0 to Alice and they proceed to Stage t + 1.
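The stages above can be simulated directly. The following is a minimal sketch under several assumptions that are not in the paper: the shared random hash functions are emulated by a seeded pseudo-random generator (`hash_bit`), the buckets $S_i$ are taken to be $\{x : \mu(x) \in (2^{-i-1}, 2^{-i}]\}$, and Bob's distribution is passed as a dictionary. All function and parameter names are hypothetical.

```python
import math
import random


def hash_bit(seed, index, element):
    """Value of the index-th shared random hash function on `element`.

    Emulated with a seeded PRNG (an assumption of this sketch).
    """
    return random.Random(f"{seed}:{index}:{element}").getrandbits(1)


def bucket(p):
    """Index i such that p lies in (2^(-i-1), 2^(-i)]."""
    return math.floor(-math.log2(p))


def transmit(a, mu, eps, h, seed=0):
    """Simulate the staged protocol; returns (Bob's output, total bits)."""
    k = math.ceil(math.log2(1.0 / eps)) + 1
    buckets = {}
    for x, p in mu.items():
        if p > 0:
            buckets.setdefault(bucket(p), []).append(x)
    sent = 0  # hash values Alice has sent so far
    bits = 0  # total communication in both directions
    t = 0
    while True:
        new = k if t == 0 else h
        sent += new
        bits += new
        alice = [hash_bit(seed, j, a) for j in range(sent)]
        indices = [0] if t == 0 else range(h * (t - 1) + 1, h * t + 1)
        for i in indices:
            for x in buckets.get(i, []):
                if all(hash_bit(seed, j, x) == alice[j] for j in range(sent)):
                    return x, bits + 1  # Bob replies 1 and outputs x
        bits += 1  # Bob replies 0
        t += 1
```

Since a itself always agrees with its own hash values, the loop stops no later than at the stage whose index range contains the bucket of a.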


Let us first show that the protocol terminates for all a in the support of $\mu$.
Assume that Alice has a and Bob has $\mu$. Let $i = \lfloor \log_2(1/\mu(a)) \rfloor$, so that $a \in S_i$.

The protocol terminates on Stage t where

$$h(t-1) + 1 \le i \le ht$$

or earlier. Indeed, all hash values of a available to Bob on Stage t coincide with
hash values of some element of $S_i$ (for instance, with those of a).

Thus Alice sends at most $k + ht$ bits to Bob and Bob sends at most $1 + t$
bits to Alice. Therefore total communication is bounded by

$$k + ht + 1 + t = k + h(t-1) + h + 2 + (t-1)$$

$$\le k + (i - 1) + h + 2 + \frac{i-1}{h}$$

$$< k + \log_2\frac{1}{\mu(a)} + \frac{\log_2(1/\mu(a))}{h} + h + O(1).$$

Since $k = \lceil \log_2(1/\varepsilon) \rceil + 1$, the required bound follows.

Now we bound the error probability. An error may occur if for some t a
set $S_i$ considered on Stage t has an element $b \ne a$ which agrees with the hash values
sent by Alice. At that time Bob already has $k + ht \ge k + i$ hash values. The
probability that $k + i$ hash values of b coincide with those of a is $2^{-k-i}$. Hence
by the union bound the error probability does not exceed

$$\sum_{i=0}^{\infty} |S_i| 2^{-k-i} = 2^{-k+1} \sum_{i=0}^{\infty} |S_i| 2^{-i-1} < 2^{-k+1} \sum_{i=0}^{\infty} \sum_{x \in S_i} \mu(x)$$

$$= 2^{-k+1} \sum_{x \in A} \mu(x) = 2^{-k+1} = 2^{-\lceil \log_2(1/\varepsilon) \rceil} \le \varepsilon.$$

x^A 


□ 

Theorem 3.1. Let X, Y be jointly distributed random variables that take values
in the finite sets $\mathcal{X}$ and $\mathcal{Y}$. Then for every positive $\varepsilon$ there exists a public-coin
protocol with the following properties.

• For every pair (x, y) from the support of (X, Y), with probability at least
$1 - \varepsilon$ Bob outputs x;

• The expected length of communication is at most

$$H(X|Y) + 2\sqrt{H(X|Y)} + \log_2\frac{1}{\varepsilon} + O(1).$$


Proof. On input x, y, Alice and Bob run the protocol of Lemma 3.1 with $A = \mathcal{X}$,
$h = \lceil \sqrt{H(X|Y)} \rceil$, $a = x$ and $\mu$ equal to the distribution of X conditioned on
the event $Y = y$. Notice that Alice knows a and Bob knows $\mu$.

Let us show that both requirements are fulfilled for this protocol. The first
requirement immediately follows from the first property of the protocol of Lemma
3.1.

From the second property of the protocol of Lemma 3.1 it follows that for
input pair x, y our protocol communicates at most

$$\log_2\frac{1}{\Pr[X = x|Y = y]} + \frac{\log_2\left(\frac{1}{\Pr[X = x|Y = y]}\right)}{\lceil\sqrt{H(X|Y)}\rceil} + \lceil\sqrt{H(X|Y)}\rceil + \log_2\frac{1}{\varepsilon} + O(1)$$

bits. Recalling that

$$H(X|Y) = \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \Pr[X = x, Y = y] \log_2\frac{1}{\Pr[X = x|Y = y]},$$

we see that on average the communication is as short as required. □
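The averaging step at the end of this proof can be made concrete. The sketch below assumes that the stage width is $h = \lceil\sqrt{H(X|Y)}\rceil$ (clamped to at least 1, which is a guard added here) and averages the input-dependent part of the bound of Lemma 3.1 over a joint distribution given as a dictionary; the additive $\log_2(1/\varepsilon) + O(1)$ term is left out, and all names are illustrative.

```python
import math
from collections import defaultdict


def conditional_entropy(joint):
    """H(X|Y) for a dict {(x, y): probability}."""
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    return sum(p * math.log2(p_y[y] / p)
               for (_, y), p in joint.items() if p > 0)


def averaged_bound(joint):
    """Average of s + s/h + h, where s = log2(1 / Pr[x | y]).

    Returns (average, H(X|Y)). The guard h >= 1 is an assumption of
    this sketch, needed when H(X|Y) = 0.
    """
    H = conditional_entropy(joint)
    h = max(1, math.ceil(math.sqrt(H)))
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    total = 0.0
    for (_, y), p in joint.items():
        if p > 0:
            s = math.log2(p_y[y] / p)
            total += p * (s + s / h + h)
    return total, H
```

The average equals $H + H/h + h$, which is at most $H + 2\sqrt{H} + 1$ for this choice of h. For X uniform on 16 values and a trivial Y it is exactly $4 + 2 + 2 = 8$.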

Remark. One may wonder whether there exists a private-coin communication
protocol with the same properties as the protocol of Theorem 3.1. Newman’s
theorem ( 0 ) states that every public-coin protocol can be transformed
into a private-coin protocol at the expense of increasing the error probability
by $\delta$ and the worst-case communication by $O(\log\log|\mathcal{X} \times \mathcal{Y}| + \log 1/\delta)$ (for any
positive $\delta$). Lemma 3.1 provides an upper bound for the error probability and
communication of our protocol for each pair of inputs. Repeating the arguments
from the proof of Newman’s theorem, we are able to transform the public-coin
protocol of Lemma 3.1 into a private-coin one with the same trade-off between
the increase of error probability and the increase of communication length. It
follows that for our problem there exists a private-coin communication protocol
which errs with probability at most $\varepsilon$ and communicates on average as many
bits as the public-coin protocol from Theorem 3.1 plus extra $O(\log\log|\mathcal{X} \times \mathcal{Y}|)$
bits.

4 One-shot Slepian-Wolf theorem with a constant number of rounds on average

In this section, we modify the construction from the previous section to reduce 
the average number of rounds to a constant. 

Theorem 4.1. Let X, Y be jointly distributed random variables that take values
in the finite sets $\mathcal{X}$ and $\mathcal{Y}$. Then for every positive $\varepsilon$ there exists a public-coin
protocol with the following properties:

• For every pair (x, y) from the support of (X, Y), with probability at least
$1 - \varepsilon$ Bob outputs x;

• The expected length of the protocol does not exceed
$3H(X|Y) + \log_2(1/\varepsilon) + O(1)$;

• The expected number of rounds in the protocol is at most 4.

(Compared to Theorem 3.1, the number of rounds has decreased and the
communication length has increased.)

Proof. We will use the following notation:

$$k = \lceil \log_2(1/\varepsilon) \rceil + 1, \qquad l = \lceil H(X|Y) \rceil,$$

$$\mu(x, y) = \Pr[X = x, Y = y], \qquad \mu(x|y) = \Pr[X = x|Y = y].$$

Alice and Bob apply the following modification of the protocol of Lemma 3.1.
Recall that that protocol works in stages. On Stage 0 Alice sends to Bob k
random hash bits and on each subsequent stage Alice sends to Bob h extra
random hash bits. On each stage Bob looks for an element in all sets

$$S_i = \{x' \mid \mu(x'|y) \in (2^{-i-1}, 2^{-i}]\}$$

such that i is at least k less than the total number of hash bits he has so far.
This guarantees that the error probability is at most $\varepsilon$ for all input pairs.




Now Alice sends $k + l$ hash bits on Stage 0 and $2^t l$ new hash bits on Stage
$t > 0$. This is the main difference between the new protocol and the protocol
of Theorem 3.1. In order to keep the error probability at most $\varepsilon$, on Stage t
Bob looks for an element in $S_i$ with the same hash values as sent by Alice for
$i \le l + 2l + \ldots + 2^t l$. If there is such an element, then Bob outputs any such
element (and sends 1 to Alice). Otherwise he sends 0 and they proceed to the
next stage.

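As earlier, by the union bound the error probability does not exceed

$$\sum_{i=0}^{\infty} |S_i| 2^{-k-i} = 2^{-k+1} \sum_{i=0}^{\infty} |S_i| 2^{-i-1} < 2^{-k+1} \sum_{i=0}^{\infty} \sum_{x' \in S_i} \mu(x'|y)$$

$$= 2^{-k+1} \sum_{x' \in \mathcal{X}} \mu(x'|y) \le 2^{-k+1} = 2^{-\lceil \log_2(1/\varepsilon) \rceil} \le \varepsilon.$$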


Now we will estimate the communication length on each input pair (x, y) of
positive probability. Bob sends one bit in each round. As we will see, the average
number of rounds is at most 4, thus we may forget about the communication
from Bob and concentrate on communication from Alice.

Set $j = j(x, y) = \lfloor \log_2(1/\mu(x|y)) \rfloor$. Notice that $x \in S_j$. Consider t such that

$$l + 2l + \ldots + 2^{t-1} l < j \le l + 2l + \ldots + 2^t l. \qquad (2)$$


By the construction of the protocol the communication length for input x, y is
at most

$$k + l + 2l + \ldots + 2^t l = k + l + 2(l + 2l + \ldots + 2^{t-1} l) < k + l + 2j.$$

Hence the expected length of communication from Alice to Bob is at most

$$\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \mu(x, y)(k + l + 2j(x, y)) \le k + l + 2\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \mu(x, y)\log_2\frac{1}{\mu(x|y)}$$

$$= k + l + 2H(X|Y) = 3H(X|Y) + k + O(1).$$
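The doubling schedule can be checked with a few lines of code. The sketch below assumes that after Stage t Alice has sent $k + (2^{t+1} - 1)l$ hash bits in total and that the protocol stops at the first stage whose index range reaches j; the helper names are illustrative.

```python
def stage_for(j, l):
    """Smallest t with j <= l + 2l + ... + 2^t l = (2^(t+1) - 1) * l."""
    t = 0
    while j > (2 ** (t + 1) - 1) * l:
        t += 1
    return t


def alice_bits(j, l, k):
    """Hash bits sent by Alice up to and including the stage reaching j."""
    t = stage_for(j, l)
    return k + (2 ** (t + 1) - 1) * l
```

For every j and every positive l the returned value is at most $k + l + 2j$: doubling the stage size wastes at most a factor of two over j.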


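Let us bound the expected number of rounds in our protocol. Let R(x, y)
stand for the number of rounds for inputs X = x, Y = y. Then R(x, y) is at most
$2t + 2$, where t is defined by (2). By (2) we have

$$(2^t - 1)l < j \le \log_2\frac{1}{\mu(x|y)}$$

and hence:

$$t < \log_2\left(1 + \frac{\log_2(1/\mu(x|y))}{l}\right).$$

Thus:

$$R(x, y) < 2 + 2\log_2\left(1 + \frac{\log_2(1/\mu(x|y))}{l}\right).$$

By concavity of the logarithmic function the average number of rounds does not
exceed:

$$\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \mu(x, y)\left(2 + 2\log_2\left(1 + \frac{\log_2(1/\mu(x|y))}{l}\right)\right)$$

$$\le 2 + 2\log_2\left(\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \mu(x, y)\left(1 + \frac{\log_2(1/\mu(x|y))}{l}\right)\right)$$

$$= 2 + 2\log_2\left(1 + \frac{H(X|Y)}{l}\right) \le 2 + 2\log_2(2) = 4.$$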


□ 


5 One-round Compression 

Information complexity of the protocol $\pi$ with inputs (X, Y) is defined as

$$IC_\mu(\pi) = I(X : \Pi|Y, R) + I(Y : \Pi|X, R)$$

$$= I(X : \Pi|Y, R, R_B) + I(Y : \Pi|X, R, R_A)$$

$$= I(X : \Pi, R, R_B|Y) + I(Y : \Pi, R, R_A|X),$$

where $R, R_A, R_B$ denote (shared, Alice’s and Bob’s) randomness, $\mu$ stands for
the distribution of (X, Y) and $\Pi$ stands for the concatenation of all bits sent in $\pi$
($\Pi$ is called a transcript). The first term is equal to the information which Bob
learns about Alice’s input and the second term is equal to the information which
Alice learns about Bob’s input. Information complexity is an important concept
in Communication Complexity. For example, information complexity plays
the crucial role in the Direct-Sum problem f [lOjl. 

We will consider the special case when $\pi$ is one-round. In this case Alice
sends one message $\Pi$ to Bob, then Bob outputs the result (based on his input,
his randomness, and Alice’s message) and the protocol terminates. Since Alice
learns nothing, information complexity can be re-written as

$$I = IC_\mu(\pi) = I(X : \Pi|Y, R).$$

Our goal is to simulate a given one-round protocol $\pi$ with another protocol
$\tau$ which has the same input space (X, Y) and whose expected communication
complexity is close to I. The new protocol $\tau$ may be many-round. The quality
of simulation will be measured by the statistical distance. Statistical distance
between random variables A and B, both taking values in the set V, equals

$$\delta(A, B) = \max_{U \subseteq V} |\Pr[A \in U] - \Pr[B \in U]|.$$
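For small supports the statistical distance can be computed directly from this definition. The sketch below does so by brute force over all subsets and compares the result with half of the $L_1$ distance, a standard equivalent form; distributions are represented as dictionaries, and the function names are illustrative.

```python
from itertools import combinations


def statistical_distance_bruteforce(pa, pb):
    """Maximum over all subsets U of |Pr[A in U] - Pr[B in U]|."""
    support = sorted(set(pa) | set(pb))
    best = 0.0
    for r in range(len(support) + 1):
        for U in combinations(support, r):
            diff = abs(sum(pa.get(v, 0.0) for v in U)
                       - sum(pb.get(v, 0.0) for v in U))
            best = max(best, diff)
    return best


def statistical_distance_l1(pa, pb):
    """Half of the L1 distance between the two distributions."""
    support = set(pa) | set(pb)
    return 0.5 * sum(abs(pa.get(v, 0.0) - pb.get(v, 0.0)) for v in support)
```

The brute-force version is exponential in the size of the support and is meant only as a check of the definition.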
One of the main results of [3] is the following theorem.

Theorem 5.1. For every one-round protocol $\pi$ and for every probability
distribution $\mu$ there is a public-coin protocol $\tau$ with expected length (with respect to $\mu$
and the randomness of $\tau$) at most $I + 5\sqrt{I} + O(\log_2(1/\varepsilon))$ such that for each pair
of inputs (x, y) after termination of $\tau$ Bob outputs a random variable $\Pi'$ with
$\delta((\Pi|X = x, Y = y), (\Pi'|X = x, Y = y)) \le \varepsilon$.

We will show that Theorem 3.1 implies that we can replace $5\sqrt{I}$ by about
$2\sqrt{I}$ in this theorem. We want to transmit Alice’s message $\Pi$ to Bob (who knows
Y and his randomness R) in many rounds so that the expected communication
length is small. By Theorem 3.1 this task can be solved with error $\varepsilon$ in expected
communication

$$H(\Pi|Y, R) + 2\sqrt{H(\Pi|Y, R)} + O\left(\log_2\frac{1}{\varepsilon}\right). \qquad (3)$$


Assume first that the original protocol $\pi$ uses only public randomness. Then

$$I = I(X : \Pi|Y, R) = H(\Pi|Y, R) - H(\Pi|X, Y, R) = H(\Pi|Y, R).$$

Indeed, $H(\Pi|X, Y, R) = 0$, since $\Pi$ is defined by X, R. Thus (3) becomes

$$I + 2\sqrt{I} + O\left(\log_2\frac{1}{\varepsilon}\right)$$

and we are done.

If $\pi$ also uses private randomness, this argument does not apply directly.
Fortunately, by the following theorem from [2] we can remove private coins
from the protocol with only a slight increase in information complexity.

Theorem 5.2. For every one-round protocol $\pi$ there is a one-round public-coin
protocol $\pi'$ with information complexity $IC_\mu(\pi') \le I + \log_2(I + O(1))$ such that
for each pair of inputs (x, y) Bob outputs $\Pi'$ for which $\Pi'|X = x, Y = y$ and
$\Pi|X = x, Y = y$ are identically distributed.

Combining this theorem with our main result (Theorem 3.1), we obtain the
following theorem.

Theorem 5.3. For every one-round protocol $\pi$ and for every probability
distribution $\mu$ there is a public-coin protocol $\tau$ with expected length (with respect
to $\mu$ and the randomness of $\tau$) at most

$$I + \log_2(I + O(1)) + 2\sqrt{I + \log_2(I + O(1))} + O\left(\log_2\frac{1}{\varepsilon}\right)$$

such that for each pair of inputs (x, y) Bob outputs $\Pi'$ with
$\delta((\Pi|X = x, Y = y), (\Pi'|X = x, Y = y)) \le \varepsilon$.
6 Lower Bounds for the Average-Case Communication

Let (X, Y) be a pair of jointly distributed random variables. Assume that $\pi$ is
a deterministic protocol to transmit X from Alice to Bob who knows Y. Let
$\pi(X, Y)$ stand for the result output by the protocol $\pi$ for input pair (X, Y). We
assume that for at least a $1 - \varepsilon$ fraction of input pairs this result is correct:

$$\Pr[\pi(X, Y) \ne X] \le \varepsilon.$$

It is not hard to see that in this case the expected communication length cannot
be much less than $H(X|Y)$ bits on average. Moreover, this applies to
communication from Alice to Bob only.

Proposition 6.1. For every deterministic protocol as above the expected
communication from Alice to Bob is at least $H(X|Y) - \varepsilon\log_2|\mathcal{X}| - 1$.

Proof. Indeed, let $\Pi_A$ denote the concatenation of all bits sent by Alice. If Bob’s
input is fixed, then the set of all possible values of $\Pi_A$ forms a prefix-free code.
Hence

$$E[|\Pi_A| \mid Y = y] \ge H(\Pi_A|Y = y)$$

and therefore

$$E|\Pi_A| = E_{y \sim Y} E[|\Pi_A| \mid Y = y] \ge E_{y \sim Y} H(\Pi_A|Y = y) = H(\Pi_A|Y).$$

Consider $I(X : \Pi_A|Y)$. By definition $I(X : \Pi_A|Y) \le H(\Pi_A|Y)$. On the other
hand we have

$$I(X : \Pi_A|Y) = H(X|Y) - H(X|Y, \Pi_A).$$

Notice that $\pi(X, Y)$ is a function of Y and $\Pi_A$ (Bob’s guess is based on Y and
on bits received from Alice) and hence $H(X|Y, \Pi_A) \le H(X|\pi(X, Y))$. Since
$\Pr[\pi(X, Y) \ne X] \le \varepsilon$, from the Fano inequality it follows that

$$H(X|\pi(X, Y)) \le 1 + \varepsilon\log_2|\mathcal{X}|.$$

Therefore $E|\Pi_A| \ge H(X|Y) - \varepsilon\log_2|\mathcal{X}| - 1$. □

There are random variables for which this lower bound is tight. For instance,
let Y be empty, let X take each value $x \in \{0,1\}^n$ with probability $\varepsilon/2^n$,
and let X be the empty string with the remaining probability
$1 - \varepsilon$. Then the trivial protocol with no communication solves the job with
error probability $\varepsilon$ and $H(X|Y) \approx \varepsilon\log_2|\mathcal{X}|$.

In this section we consider the following question: are there random variables
(X, Y) for which for every deterministic communication protocol the expected
communication is significantly larger than $H(X|Y)$, say close to the upper
bound $H(X|Y) + 2\sqrt{H(X|Y)} + \log_2(1/\varepsilon)$ of Theorem 3.1? Notice that from
the proof of Theorem 3.1 it follows that there exists a deterministic protocol
which transmits X from Alice to Bob using $H(X|Y) + 2\sqrt{H(X|Y)} + O(\log_2(1/\varepsilon))$
bits on average and which guarantees that for at most an $\varepsilon$-fraction of inputs (with
respect to the distribution of (X, Y)) the transmission is incorrect. Indeed, for
any choice of randomness the communication on each pair of inputs is bounded
by Lemma 3.1. Thus we may fix random bits so that the error probability is at
most $\varepsilon$.

Orlitsky showed that if no error is allowed and the support of (X, Y) is a
Cartesian product, then every deterministic protocol must communicate $H(X)$
bits on average.

Lemma 6.1. Let (X, Y) be a pair of jointly distributed random variables whose
support is a Cartesian product. Assume that $\pi$ is a deterministic protocol which
transmits X from Alice to Bob who knows Y and

$$\Pr[\pi(X, Y) \ne X] = 0.$$

Then the expected length of $\pi$ is at least $H(X)$.


For the sake of completeness we provide a proof of this result in the Appendix.
The main result of this section states that there are random variables
(X, Y) such that transmission of X from Alice to Bob with error probability $\varepsilon$
requires $H(X|Y) + \Omega(\log_2(1/\varepsilon))$ bits on average.

The random variables X, Y are specified by two parameters, $\delta \in (0, 1/2)$ and
$n \in \mathbb{N}$. Both random variables take values in $\{0, 1, \ldots, n\}$ and are distributed as
follows: Y is distributed uniformly in $\{0, 1, \ldots, n\}$, X = Y with probability
$1 - \delta$, and X is uniformly distributed in $\{0, 1, \ldots, n\} \setminus \{Y\}$ with the remaining
probability $\delta$. That is,

$$\Pr[X = i, Y = j] = \frac{(1 - \delta)\delta_{ij} + \frac{\delta}{n}(1 - \delta_{ij})}{n + 1},$$

where $\delta_{ij}$ stands for the Kronecker delta. Notice that X is uniformly
distributed on $\{0, 1, \ldots, n\}$ as well. A straightforward calculation reveals that

$$\Pr[X = i|Y = j] = \frac{\Pr[X = i, Y = j]}{\Pr[Y = j]} = (1 - \delta)\delta_{ij} + \frac{\delta}{n}(1 - \delta_{ij})$$

and

$$H(X|Y) = (1 - \delta)\log_2\frac{1}{1 - \delta} + \delta\log_2\frac{n}{\delta} = \delta\log_2 n + O(1).$$


We will think of $\delta$ as a constant, say 1/4. For one-way protocols we are able
to show that the communication length must be close to $\log n$, which is about $1/\delta$
times larger than $H(X|Y)$.
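The distribution of this section is small enough to build explicitly. The sketch below constructs the joint distribution for given n and $\delta$ (diagonal pairs get probability $(1-\delta)/(n+1)$, all other pairs $\delta/(n(n+1))$), computes $H(X|Y)$ from it, and compares the value with the closed form $(1-\delta)\log_2\frac{1}{1-\delta} + \delta\log_2\frac{n}{\delta}$. The function names are illustrative.

```python
import math
from collections import defaultdict


def joint_distribution(n, delta):
    """Joint distribution of (X, Y) on {0..n} x {0..n} as a dict."""
    probs = {}
    for i in range(n + 1):
        for j in range(n + 1):
            if i == j:
                probs[(i, j)] = (1 - delta) / (n + 1)
            else:
                probs[(i, j)] = delta / (n * (n + 1))
    return probs


def conditional_entropy(joint):
    """H(X|Y) for a dict {(x, y): probability}."""
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    return sum(p * math.log2(p_y[y] / p)
               for (_, y), p in joint.items() if p > 0)


def closed_form(n, delta):
    """(1 - delta) log2(1/(1 - delta)) + delta log2(n / delta)."""
    return ((1 - delta) * math.log2(1 / (1 - delta))
            + delta * math.log2(n / delta))
```

For n = 16 and $\delta = 1/4$ both computations give about 1.81 bits, while $\log_2(n+1)$ is about 4.09.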

Proposition 6.2. Assume that $\pi$ is a one-way deterministic protocol which
transmits X from Alice to Bob who knows Y and

$$\Pr[\pi(X, Y) \ne X] \le \varepsilon.$$

Then the expected length of $\pi$ is at least $\left(1 - \frac{\varepsilon}{\delta}\right)\log_2(n + 1) - 2$.
Proof. Let S be the number of leaves in $\pi$. For each $j \in \{0, 1, \ldots, n\}$

$$\#\{i \in \{0, 1, \ldots, n\} \mid \pi(i, j) = i\} \le S.$$

Hence the error probability $\varepsilon$ is at least $(n + 1 - S)\frac{\delta}{n}$. This implies that

$$S \ge n\left(1 - \frac{\varepsilon}{\delta}\right) + 1 \ge (n + 1)\left(1 - \frac{\varepsilon}{\delta}\right).$$

Let $\Pi(X)$ denote the leaf Alice and Bob reach in $\pi$ (since the protocol is one-way,
the leaf depends only on X). The expected length of $\pi$ is at least
$H(\Pi(X))$. Let $l_1, l_2, \ldots, l_S$ be the list of all leaves in the support of the random
variable $\Pi(X)$. As X is distributed uniformly, we have

$$\Pr[\Pi = l_i] \ge \frac{1}{n + 1}$$

for all i. The statement follows from the following lemma.


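Lemma 6.2. Assume that $p_1, \ldots, p_k, q_1, \ldots, q_k \in (0, 1)$ satisfy

$$\sum_{i=1}^{k} p_i = 1, \qquad \forall i \in \{1, \ldots, k\} \quad p_i \ge q_i.$$

Then

$$\sum_{i=1}^{k} p_i\log_2\frac{1}{p_i} \ge \sum_{i=1}^{k} q_i\log_2\frac{1}{q_i} - 2.$$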

The proof of this technical lemma is deferred to the Appendix. The lemma
implies that

$$H(\Pi(X)) \ge \frac{S}{n + 1}\log_2(n + 1) - 2 \ge \left(1 - \frac{\varepsilon}{\delta}\right)\log_2(n + 1) - 2.$$

□ 
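Lemma 6.2 can be sanity-checked on random instances. The sketch below is only a numerical illustration and not a proof; the instance generator, which scales each $p_i$ by a random factor in (0, 1] to obtain $q_i$, is an arbitrary choice made here.

```python
import math
import random


def f_sum(values):
    """Sum of v * log2(1 / v) over the given values."""
    return sum(v * math.log2(1.0 / v) for v in values)


def random_instance(k, rng):
    """Random p (summing to 1) and q with 0 < q_i <= p_i < 1, for k >= 2."""
    raw = [rng.random() + 1e-9 for _ in range(k)]
    total = sum(raw)
    p = [r / total for r in raw]
    # rng.random() lies in [0, 1), so the scaling factor lies in (0, 1].
    q = [pi * (1.0 - rng.random()) for pi in p]
    return p, q
```

On every generated instance the left-hand side of the lemma is at least the right-hand side; the slack of 2 is needed because of the (at most two) indices with $p_i > 1/e$.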


The next theorem states that for any fixed $\delta$ every two-way deterministic
protocol with error probability $\varepsilon$ must communicate about
$H(X|Y) + (1 - \delta)\log_2(1/\varepsilon)$ bits on average.

Theorem 6.1. Assume that $\pi$ is a deterministic protocol which transmits X
from Alice to Bob who knows Y and

$$\Pr[\pi(X, Y) \ne X] \le \varepsilon.$$

Then the expected length of $\pi$ is at least

$$(1 - \delta - \delta/n)\log_2\left(\frac{\delta}{\varepsilon + \delta/n}\right) + (\delta - 2\varepsilon)\log_2(n + 1) - 2\delta.$$
The lower bound in this theorem is quite complicated and comes from its
proof. To understand this bound assume that $\delta$ is a constant, say $\delta = 1/4$,
and $\frac{1}{n} \le \varepsilon \le \frac{1}{\log_2 n}$. Then $H(X|Y) = (1/4)\log_2 n + O(1)$ and the lower bound
becomes

$$\left(\frac{3}{4} - \frac{1}{4n}\right)\log_2\left(\frac{1/4}{\varepsilon + \frac{1}{4n}}\right) + (1/4 - 2\varepsilon)\log_2(n + 1) - \frac{1}{2}.$$

The condition $\frac{1}{n} \le \varepsilon$ implies that the first term is equal to

$$(3/4)\log_2(1/\varepsilon) - O(1).$$

The condition $\varepsilon \le \frac{1}{\log_2 n}$ implies that the second term is equal to

$$(1/4)\log_2 n - O(1).$$

Therefore under these conditions the lower bound becomes

$$(1/4)\log_2 n + (3/4)\log_2(1/\varepsilon) - O(1) = H(X|Y) + (3/4)\log_2(1/\varepsilon) - O(1).$$

Proof. Let $\Pi = \Pi(X, Y)$ denote the leaf Alice and Bob reach in the protocol $\pi$ for
input pair (X, Y). As we have seen, the expected length of communication is at
least the entropy $H(\Pi(X, Y))$. Let $l_1, \ldots, l_S$ denote all the leaves in the support
of the random variable $\Pi(X, Y)$. The set $R_i = \{(x, y) \mid \Pi(x, y) = l_i\}$ is a combinatorial
rectangle $R_i \subseteq \{0, 1, \ldots, n\} \times \{0, 1, \ldots, n\}$. Imagine $\{0, 1, \ldots, n\} \times \{0, 1, \ldots, n\}$
as a table in which Alice owns columns and Bob owns rows. Let $h_i$ be the
height of $R_i$ and $w_i$ be the width of $R_i$. Let $d_i$ stand for the number of diagonal
elements in $R_i$ (pairs of the form (j, j)). By definition of (X, Y) we have

$$\Pr[\Pi(X, Y) = l_i] = \frac{(1 - \delta)d_i}{n + 1} + \frac{\delta(h_i w_i - d_i)}{n(n + 1)}. \qquad (4)$$

The numbers $\{\Pr[\Pi(X, Y) = l_i]\}_{i=1}^{S}$ define a probability distribution over the
set $\{1, 2, \ldots, S\}$ and its entropy equals $H(\Pi(X, Y))$. Equation (4) represents
this distribution as a weighted sum of the following distributions:
$\left\{\frac{d_i}{n+1}\right\}_{i=1}^{S}$ and $\left\{\frac{h_i w_i}{(n+1)^2}\right\}_{i=1}^{S}$. That is, Equation (4) implies that

$$\{\Pr[\Pi = l_i]\}_{i=1}^{S} = (1 - \delta - \delta/n)\left\{\frac{d_i}{n+1}\right\}_{i=1}^{S} + (\delta + \delta/n)\left\{\frac{h_i w_i}{(n+1)^2}\right\}_{i=1}^{S}.$$

Since entropy is concave, we have

$$H(\Pi) = H\left(\{\Pr[\Pi = l_i]\}_{i=1}^{S}\right) \ge (1 - \delta - \delta/n)H\left(\left\{\frac{d_i}{n+1}\right\}_{i=1}^{S}\right) + (\delta + \delta/n)H\left(\left\{\frac{h_i w_i}{(n+1)^2}\right\}_{i=1}^{S}\right).$$

The lower bound of the theorem follows from lower bounds on the entropies of
these distributions.


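A lower bound for $H\left(\left\{\frac{d_i}{n+1}\right\}_{i=1}^{S}\right)$. In each row of $R_i$ there is at most 1
element (x, y) for which $\pi(x, y) = x$. The rectangle $R_i$ contains $d_i$ diagonal
elements and hence there are at least $d_i^2 - d_i$ elements (x, y) in $R_i$ for which
$\pi(x, y) \ne x$. Summing over all i we get

$$\varepsilon \ge \sum_{i=1}^{S} \frac{\delta(d_i^2 - d_i)}{n(n + 1)}$$

and thus, since $\sum_{i=1}^{S} d_i = n + 1$,

$$\sum_{i=1}^{S}\left(\frac{d_i}{n + 1}\right)^2 \le \frac{\varepsilon + \delta/n}{\delta}.$$

Since Rényi entropy is a lower bound for the Shannon entropy, we have

$$H\left(\left\{\frac{d_i}{n+1}\right\}_{i=1}^{S}\right) \ge \log_2\left(\frac{1}{\sum_{i=1}^{S}\left(\frac{d_i}{n+1}\right)^2}\right) \ge \log_2\left(\frac{\delta}{\varepsilon + \delta/n}\right).$$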

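In $R_i$, there are at most $h_i$ good pairs (for which $\pi$ works correctly). At most
$d_i$ of them have probability $\frac{1 - \delta}{n + 1}$. Hence

$$\Pr[\Pi = l_i, \pi(X, Y) = X] \le \frac{(1 - \delta)d_i}{n + 1} + \frac{\delta(h_i - d_i)}{n(n + 1)}$$

and

$$1 - \varepsilon \le \Pr[\pi(X, Y) = X] = \sum_{i=1}^{S} \Pr[\Pi = l_i, \pi(X, Y) = X]$$

$$\le \sum_{i=1}^{S}\left(\frac{(1 - \delta)d_i}{n + 1} + \frac{\delta(h_i - d_i)}{n(n + 1)}\right) = 1 - \delta - \delta/n + \frac{\delta}{n(n + 1)}\sum_{i=1}^{S} h_i.$$

The last inequality implies that

$$\sum_{i=1}^{S} h_i \ge (1 - \varepsilon/\delta)(n + 1)^2.$$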

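A lower bound for $H\left(\left\{\frac{h_i w_i}{(n+1)^2}\right\}_{i=1}^{S}\right)$. Since $h_i \le n + 1$, we have

$$H\left(\left\{\frac{h_i w_i}{(n+1)^2}\right\}_{i=1}^{S}\right) = \sum_{i=1}^{S} \frac{h_i w_i}{(n + 1)^2}\log_2\frac{(n + 1)^2}{h_i w_i} \ge \sum_{i=1}^{S} \frac{h_i w_i}{(n + 1)^2}\log_2\frac{n + 1}{w_i}$$

$$= -\log_2(n + 1) + \sum_{i=1}^{S} h_i \cdot \frac{w_i}{(n + 1)^2}\log_2\frac{(n + 1)^2}{w_i}.$$

Obviously

$$\sum_{i=1}^{S} h_i \cdot \frac{w_i}{(n + 1)^2} = 1.$$

By Lemma 6.2 (applied to the numbers $\frac{w_i}{(n+1)^2}$, each taken $h_i$ times, and to the
numbers $\frac{1}{(n+1)^2}$) we get

$$\sum_{i=1}^{S} h_i \cdot \frac{w_i}{(n + 1)^2}\log_2\frac{(n + 1)^2}{w_i} \ge \left(\sum_{i=1}^{S} h_i\right)\frac{1}{(n + 1)^2}\log_2\left((n + 1)^2\right) - 2$$

$$\ge (2 - 2\varepsilon/\delta)\log_2(n + 1) - 2.$$

Thus

$$H\left(\left\{\frac{h_i w_i}{(n+1)^2}\right\}_{i=1}^{S}\right) \ge (1 - 2\varepsilon/\delta)\log_2(n + 1) - 2.$$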
□ 
A The proof of Lemma 6.1

Let $\mathcal{X} \times \mathcal{Y}$ stand for the support of (X, Y). Fix $x \in \mathcal{X}$. Consider the set of all
possible leaves Alice and Bob may reach in $\pi$ when X = x. Let $l_x$ be the leaf of
minimal depth from this set. Denote the depth of $l_x$ by $d(l_x)$. Notice that the
expected length of the protocol $\pi$ is at least $E_{x \sim X}\, d(l_x)$.

Suppose that for some $x_1, x_2 \in \mathcal{X}$, $x_1 \ne x_2$, we have $l_{x_1} = l_{x_2}$. It means
that there exist $y_1, y_2 \in \mathcal{Y}$ such that when $X = x_1, Y = y_1$ and when
$X = x_2, Y = y_2$ Alice and Bob reach the same leaf $l_{x_1}$. From the rectangle property
it follows that when $X = x_1, Y = y_2$ Alice and Bob reach $l_{x_1}$ too. Hence when
$X = x_1, Y = y_2$ and when $X = x_2, Y = y_2$, Bob outputs the same answer,
which is a contradiction.

Thus $l_x$ defines a bijection from the set of all possible values of X to some
prefix-free set of binary strings. Hence $E_{x \sim X}\, d(l_x) \ge H(X)$.


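B The proof of Lemma 6.2

The function $f(x) = x\log_2\frac{1}{x}$ increases on $[0, e^{-1}]$ and its maximum value is
$e^{-1}\log_2 e < 1$. Indeed,

$$f'(x) = \frac{-1 - \ln x}{\ln 2} = \frac{\ln\left(\frac{1}{ex}\right)}{\ln 2} \ge 0$$

when $x \in [0, e^{-1}]$. Since $\sum_{i=1}^{k} p_i = 1$, we have

$$\#\{i \in \{1, \ldots, k\} \mid p_i > e^{-1}\} < e.$$

The left hand side of this inequality is an integer, hence
$\#\{i \in \{1, \ldots, k\} \mid p_i > e^{-1}\} \le 2$. Thus we conclude

$$\sum_{i=1}^{k} p_i\log_2\frac{1}{p_i} = \sum_{p_i \le e^{-1}} p_i\log_2\frac{1}{p_i} + \sum_{p_i > e^{-1}} p_i\log_2\frac{1}{p_i}$$

$$\ge \sum_{p_i \le e^{-1}} q_i\log_2\frac{1}{q_i} + \sum_{p_i > e^{-1}} 0$$

$$\ge \sum_{p_i \le e^{-1}} q_i\log_2\frac{1}{q_i} + \sum_{p_i > e^{-1}}\left(q_i\log_2\frac{1}{q_i} - 1\right) \ge \sum_{i=1}^{k} q_i\log_2\frac{1}{q_i} - 2.$$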


C Random variables for which Theorem 3.1 may be tight

We finish this paper with an example of random variables (X, Y) for which we
believe that the upper bound from Theorem 3.1 is tight. Let $H_n$ be the n-th
harmonic number:

$$H_n = \sum_{k=1}^{n} \frac{1}{k} = \ln n + O(1).$$

Let X take values in $\{1, 2, \ldots, n\}$ and Y take values in $S_n$, the set of all
permutations of the set $\{1, \ldots, n\}$. The distribution of X, Y is defined as follows:

$$\Pr[X = i, Y = \sigma] = \frac{1}{n! \cdot \sigma(i) \cdot H_n}.$$

This formula implies that $H(X|Y = \sigma)$ does not depend on $\sigma \in S_n$ and equals

$$\sum_{i=1}^{n} \frac{\log_2(iH_n)}{iH_n} = \frac{\log_2 n}{2} + O(\log\log n).$$

Thus $H(X|Y) = \frac{\log_2 n}{2} + O(\log\log n)$.
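The entropy estimate above can be checked numerically. The sketch below assumes that, for every permutation, the conditional distribution of X consists of the probabilities $1/(iH_n)$ for $i = 1, \ldots, n$ (in some order), and computes its entropy directly; the function names are illustrative.

```python
import math


def harmonic(n):
    """The n-th harmonic number H_n."""
    return sum(1.0 / k for k in range(1, n + 1))


def conditional_distribution(n):
    """Probabilities 1 / (i * H_n), i = 1..n, of X given a fixed permutation."""
    Hn = harmonic(n)
    return [1.0 / (i * Hn) for i in range(1, n + 1)]


def entropy(probs):
    """Shannon entropy in bits."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)
```

For n = 1000 the entropy exceeds $\frac{\log_2 n}{2}$ by a few bits, which is consistent with the $O(\log\log n)$ term.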

We conjecture that every deterministic protocol which transmits X from
Alice to Bob who knows Y with error probability $\varepsilon < 1/\log_2 n$ communicates
at least $\frac{\log_2 n}{2} + \Omega(\sqrt{\log_2 n})$ bits on average.